An evaluation of Lolita and related natural language processing systems

Author

  • Paul Callaghan
Abstract

Submitted to the University of Durham for the degree of Ph.D., August 1997.

This research addresses the question, "how do we evaluate systems like LOLITA?" LOLITA is the Natural Language Processing (NLP) system under development at the University of Durham. It is intended as a platform for building NL applications. We are therefore interested in questions of evaluation for such general NLP systems. The thesis has two parts.

The first, and main, part concerns the participation of LOLITA in the Sixth Message Understanding Conference (MUC-6). The MUC-relevant portion of LOLITA is described in detail. The adaptation of LOLITA for MUC-6 is discussed, including work undertaken by the author. Performance on a specimen article is analysed qualitatively, and in detail, with anonymous comparisons to competitors' output. We also examine current LOLITA performance. A template comparison tool was implemented to aid these analyses. The overall scores are then considered. A methodology for analysis is discussed, and a comparison made with current scores. The comparison tool is used to analyse how systems performed relative to each other. One method, Correctness Analysis, was particularly interesting. It provides a characterisation of task difficulty, and indicates how systems approached a task. Finally, MUC-6 is analysed. In particular, we consider the methodology and ways of interpreting the results. Several criticisms of MUC-6 are made, along with suggestions for future MUC-style events.

The second part considers evaluation from the point of view of general systems. A literature review shows a lack of serious work on this aspect of evaluation. A first-principles discussion of evaluation, starting from a view of NL systems as a particular kind of software, raises several interesting points for single task evaluation. No evaluations could be suggested for general systems; their value was seen as primarily economic. That is, we are unable to analyse their linguistic capability directly.

Declaration

The material contained within this thesis has not previously been submitted for a degree at the University of Durham or any other university. The research reported within this thesis has been conducted by the author unless indicated otherwise.

Copyright Notice

The copyright of this thesis rests with the author. No quotation from it should be published without his prior written consent and information derived from it should be acknowledged.

Acknowledgements

First thanks to the Engineering and Physical Science Research Council, and the Defence Research Agency at Malvern for funding my work. Many thanks to those who have helped this thesis on its way, and to those who have made my time whilst working on it so interesting and enjoyable. You know who you are! Particular thanks must go to Roberto Garigliano, the driving force behind LOLITA, and to Rick Morgan, my supervisor. It has been a pleasure to work with you. Finally, I would like to thank my parents, Chris and Tina, for everything.

It doesn't matter if a cat is black or white, as long as it catches mice.
Deng Xiaoping, 1962

Similar resources

Parallelising a Large Functional Program or: Keeping LOLITA Busy

A parallel version of the LOLITA natural language engineering system is under construction. We believe that, at 47,000 lines of Haskell, LOLITA is the largest non-strict parallel functional program ever. In this paper we report on the ongoing parallelisation of LOLITA, which has the following interesting features common to real world applications of lazy languages: the code was not specificall...

Profiling large-scale lazy functional programs

The LOLITA natural language processor is an example of one of the ever-increasing number of large-scale systems written entirely in a functional programming language. The system consists of over 47,000 lines of Haskell code (excluding comments) and is able to perform a wide range of tasks such as semantic and pragmatic analysis of text, information extraction and query analysis. The efficiency ...

Exergoeconomic Evaluation of an Integrated Nitrogen Rejection Unit with LNG and NGL Co-Production Processes Based on the MFC and Absorbtion Refrigeration Systems

Natural gas is often associated with nitrogen and heavy compounds. The heavy components in natural gas can not only feed downstream units but may also solidify in the low-temperature process. Therefore, separating heavy components can be a necessity and can yield useful products. Virtually all natural gases contain nitrogen that would lower the heating value of natural ...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

A Lacanian Study of Lolita by Vladimir Nabokov

This paper is an attempt to explore how Lacanian concepts of desire, unconscious, as well as alienation are reflected in the major characters of Vladimir Vladimirovich Nabokov’s Lolita. Before unleashing the new, inexplicable yet highly fascinating aspects of psychoanalysis by the advent of French poststructuralist and psychoanalyst Jacques Lacan, Freudian psychoanalysis used to play the pivota...

Publication date: 1998